
Air Quality Index

大NIU, 地学分析与算法 (Geoscience Analysis and Algorithms), 2022-05-17
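The air quality index (AQI), also called the air pollution index, characterizes the degree of air pollution and is a widely used international indicator for quantitatively assessing air quality. Based on ambient air quality standards and on the effects of individual pollutants on human health, ecology, and the environment, it condenses the concentrations of several routinely monitored air pollutants (such as PM2.5, PM10, SO2, NO2, O3, and CO) into a single conceptual index value.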
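The reduction from several concentrations to one number can be sketched as follows. This is a minimal illustration, not the scraper's code: it assumes the piecewise-linear sub-index (IAQI) scheme of the Chinese standard HJ 633-2012, covers only PM2.5 and PM10, and uses 24-hour breakpoints written from memory, so they should be checked against the standard. Rounding up to an integer is also an assumption.

```python
import math

# (concentration in ug/m3, IAQI) breakpoints for 24-hour means.
# Assumed values recalled from HJ 633-2012; verify before relying on them.
BREAKPOINTS = {
    "PM25": [(0, 0), (35, 50), (75, 100), (115, 150),
             (150, 200), (250, 300), (350, 400), (500, 500)],
    "PM10": [(0, 0), (50, 50), (150, 100), (250, 150),
             (350, 200), (420, 300), (500, 400), (600, 500)],
}


def iaqi(pollutant, conc):
    """Piecewise-linear sub-index for one pollutant."""
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    table = BREAKPOINTS[pollutant]
    for (c_lo, i_lo), (c_hi, i_hi) in zip(table, table[1:]):
        if c_lo <= conc <= c_hi:
            value = (i_hi - i_lo) * (conc - c_lo) / (c_hi - c_lo) + i_lo
            # Round up; round() first guards against floating-point noise.
            return math.ceil(round(value, 6))
    return 500  # above the last breakpoint the index is capped


def aqi(concentrations):
    """The AQI is the largest sub-index among the pollutants supplied."""
    return max(iaqi(name, conc) for name, conc in concentrations.items())


print(aqi({"PM25": 34, "PM10": 63}))  # prints 57
```

With the PM2.5 and PM10 values of 2018-01-01 from the results in section 3, the sketch gives 57, the AQI reported for that day. On other days a pollutant not covered here (for example NO2) can be the largest sub-index, so the full calculation needs all six pollutants.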

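Many websites now offer public historical AQI data that can be queried, and the Ministry of Ecology and Environment runs a national real-time air quality publishing platform for real-time monitoring. Convenient ways to download the data in bulk, however, remain rare. This post shares a web-scraper implementation, in the hope that it helps those working on air quality research.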

1 Configuration file
# City name
CITYNAME = 北京
CITY = beijing
MONTH = 2013-12
# Start date (year/month)
DATE_BEGIN = 2018/01
# End date (year/month)
DATE_END = 2018/02
# Output directory
SAVE_PATH = data/
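As a rough sketch of how these settings can be read with the standard-library configparser module: the section name "AQI" below is an assumption made for illustration (the article does not show the section headers of conf.ini), while the keys and values are the ones listed above.

```python
import configparser

# Hypothetical layout: the "[AQI]" section header is assumed.
SAMPLE = """
[AQI]
# City name
CITYNAME = 北京
CITY = beijing
# Start and end dates (year/month)
DATE_BEGIN = 2018/01
DATE_END = 2018/02
# Output directory
SAVE_PATH = data/
"""

conf = configparser.ConfigParser()
conf.read_string(SAMPLE)

city = conf.get("AQI", "CITY")
date_begin = conf.get("AQI", "DATE_BEGIN")
# Output file name built in the same style as the scraper: <path><type>_<city>.csv
out_file = conf.get("AQI", "SAVE_PATH") + "AQI_" + city + ".csv"

print(city, date_begin, out_file)
```

Lines starting with # are treated as comments by configparser, so the annotated file can be read as it is once section headers are present.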
2 Code
import configparser
import datetime
import sys

import pandas as pd

# Project modules from the author's source code (not shown in this post).
import manager
import html_downloader
import html_parser
import html_output


class SpiderMain(object):
    def __init__(self, conf):
        self.conf = conf
        self.dates = manager.DateManager()
        self.downloader = html_downloader.HtmlDownloader(conf)
        self.parser = html_parser.HtmlParser(conf)
        self.output = html_output.HtmlOutput()

    def craw(self):
        # Use the instance's config instead of relying on a module-level global.
        conf = self.conf
        count = 1
        type_ = conf.get("MAIN", "SPIDER_TYPE")
        # One date per month: month ends from DATE_BEGIN up to DATE_END,
        # plus DATE_END itself so that the final month is included.
        df_dates = pd.date_range(start=conf.get(type_, "DATE_BEGIN"),
                                 end=conf.get(type_, "DATE_END"),
                                 freq="M")
        df_dates = list(df_dates)
        df_dates.append(datetime.datetime.strptime(conf.get(type_, "DATE_END"), '%Y/%m'))

        self.dates.add_new_dates(df_dates)
        while self.dates.has_new_date():
            try:
                new_date = self.dates.get_new_date()
                print("craw %d: %s %s" % (count, conf.get(type_, "CITY"), new_date.strftime("%Y-%m")))
                html_cont = self.downloader.download(new_date)
                new_data = self.parser.parse(new_date, html_cont)

                if new_data is None:
                    print("There are no data for this date")
                    count = count + 1
                    continue
                self.output.collect_data(new_data)

                count = count + 1
            except Exception as e:
                # Close the browser before exiting so no driver process is left behind.
                print(e)
                self.downloader.terminateBroswer()
                print("craw failed")
                sys.exit(1)
        self.downloader.terminateBroswer()
        self.output.save_excel(type_, conf.get("WEATHER", "SAVE_PATH") + type_ + "_" + conf.get(type_, "CITY") + ".csv")


if __name__ == '__main__':
    conf = configparser.ConfigParser()
    conf.read("conf.ini", encoding="utf-8-sig")

    obj_spider = SpiderMain(conf)
    obj_spider.craw()
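The crawl loop visits one date per month between DATE_BEGIN and DATE_END. The scraper builds that list with pd.date_range plus an appended end date; the sketch below is an alternative that uses only the standard library and is not the author's method. The helper name month_starts is hypothetical.

```python
import datetime


def month_starts(begin, end, fmt="%Y/%m"):
    """Return the first day of every month from begin to end, inclusive."""
    start = datetime.datetime.strptime(begin, fmt)
    stop = datetime.datetime.strptime(end, fmt)
    months = []
    year, month = start.year, start.month
    while (year, month) <= (stop.year, stop.month):
        months.append(datetime.date(year, month, 1))
        month += 1
        if month > 12:  # roll over into the next year
            year, month = year + 1, 1
    return months


print([d.strftime("%Y-%m") for d in month_starts("2018/01", "2018/02")])
# prints ['2018-01', '2018-02']
```

Stepping by (year, month) pairs avoids depending on month-end frequency aliases and keeps the end month in the list without a separate append.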

3 Scraped results
,Date,AQI,GRADE,PM25,PM10,SO2,CO,NO2,O3
0,2018-01-01,57,良,34,63,9,1,44,38
1,2018-01-02,50,优,28,50,7,0.8,33,46
2,2018-01-03,28,优,11,28,5,0.4,21,51
3,2018-01-04,40,优,15,30,4,0.5,32,39
4,2018-01-05,63,良,32,54,8,0.9,50,36
5,2018-01-06,48,优,16,30,5,0.5,38,48
6,2018-01-07,54,良,38,57,10,0.8,43,50
7,2018-01-08,55,良,12,60,4,0.3,11,61
8,2018-01-09,35,优,7,35,3,0.3,13,63
9,2018-01-10,32,优,6,21,3,0.3,9,63
10,2018-01-11,32,优,13,25,5,0.5,25,55
11,2018-01-12,87,良,61,81,11,1.3,69,21
12,2018-01-13,139,轻度污染,106,122,15,1.7,79,17
13,2018-01-14,176,中度污染,133,137,12,1.7,71,41
14,2018-01-15,59,良,32,58,8,0.9,47,42
15,2018-01-16,114,轻度污染,45,177,5,0.7,42,55
16,2018-01-17,77,良,42,104,8,1,49,62
17,2018-01-18,93,良,63,105,11,1.2,74,19
18,2018-01-19,108,轻度污染,81,122,13,1.5,73,36
19,2018-01-20,63,良,31,75,9,0.8,48,32
20,2018-01-21,64,良,37,77,19,1.2,44,29
21,2018-01-22,47,优,22,47,4,0.6,25,65
22,2018-01-23,32,优,11,32,4,0.4,15,60
23,2018-01-24,35,优,17,33,5,0.4,28,65
24,2018-01-25,33,优,9,23,4,0.4,26,62
25,2018-01-26,53,良,29,41,8,0.6,42,42
26,2018-01-27,112,轻度污染,84,95,14,1.4,67,14
27,2018-01-28,59,良,11,67,3,0.4,15,66
28,2018-01-29,56,良,8,61,3,0.4,20,67
29,2018-01-30,36,优,11,36,6,0.4,24,68
30,2018-01-31,62,良,22,74,7,0.5,29,65
31,2018-02-01,68,良,30,85,10,0.7,36,64
32,2018-02-02,34,优,8,30,5,0.3,11,68
33,2018-02-03,32,优,8,21,5,0.4,20,64
34,2018-02-04,40,优,19,36,8,0.5,32,55
35,2018-02-05,34,优,7,18,3,0.3,15,67
36,2018-02-06,65,良,47,61,13,0.9,51,43
37,2018-02-07,51,良,33,52,8,0.6,31,76
38,2018-02-08,73,良,53,76,12,0.9,53,49
39,2018-02-09,80,良,40,109,8,0.7,32,74
40,2018-02-10,44,优,10,44,3,0.3,12,74
41,2018-02-11,67,良,9,84,2,0.3,6,73
42,2018-02-12,53,良,12,56,4,0.4,15,73
43,2018-02-13,79,良,58,79,13,1.1,57,50
44,2018-02-14,37,优,13,33,3,0.4,18,74
45,2018-02-15,73,良,53,66,10,0.5,22,80
46,2018-02-16,107,轻度污染,80,101,14,0.6,24,85
47,2018-02-17,109,轻度污染,82,92,10,1.3,35,49
48,2018-02-18,152,中度污染,116,109,14,1.7,46,80
49,2018-02-19,188,中度污染,141,131,24,2,42,69
50,2018-02-20,63,良,45,42,6,0.7,21,85
51,2018-02-21,48,优,33,43,6,0.6,26,81
52,2018-02-22,45,优,18,45,6,0.5,29,77
53,2018-02-23,52,良,23,54,6,0.6,29,79
54,2018-02-24,62,良,30,74,10,0.6,21,65
55,2018-02-25,82,良,60,71,10,0.8,43,66
56,2018-02-26,162,中度污染,123,121,21,1.5,69,86
57,2018-02-27,218,重度污染,168,170,12,1.8,51,66
58,2018-02-28,115,轻度污染,87,116,9,1,34,74

GRADE values: 优 = excellent, 良 = good, 轻度污染 = lightly polluted, 中度污染 = moderately polluted, 重度污染 = heavily polluted.

4 Visualization
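Before plotting, it can help to sanity-check the scraped CSV with a quick text summary. The sketch below uses only the standard library and embeds the first five rows of the results from section 3; the function name summarize is hypothetical, and reading the saved file instead would depend on the encoding the scraper writes, which the article does not state.

```python
import csv
import io
from collections import Counter

# First five rows of the scraped output from section 3.
SAMPLE = """,Date,AQI,GRADE,PM25,PM10,SO2,CO,NO2,O3
0,2018-01-01,57,良,34,63,9,1,44,38
1,2018-01-02,50,优,28,50,7,0.8,33,46
2,2018-01-03,28,优,11,28,5,0.4,21,51
3,2018-01-04,40,优,15,30,4,0.5,32,39
4,2018-01-05,63,良,32,54,8,0.9,50,36
"""


def summarize(csv_text):
    """Count days, average and maximum AQI, and days per grade."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    values = [int(row["AQI"]) for row in rows]
    return {
        "days": len(rows),
        "mean_aqi": sum(values) / len(values),
        "max_aqi": max(values),
        "grades": Counter(row["GRADE"] for row in rows),
    }


summary = summarize(SAMPLE)
print(summary["days"], summary["mean_aqi"], summary["max_aqi"])
# prints 5 47.6 63
```

The grades counter gives the number of days in each category, which is usually the first thing worth comparing against a bar or line chart of the same period.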

5 Notes

This implementation mainly relies on the Python packages Selenium, BeautifulSoup, pandas, and scrapy.
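The parsing module is not shown in the post. As a rough illustration of what turning a downloaded page into rows involves, the sketch below extracts table cells with html.parser from the standard library, used here as a stand-in for BeautifulSoup so that the example runs without extra packages. The HTML fragment, the class TableRows, and the function parse_table are all hypothetical; the real site's markup may differ.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the real markup is not shown in the article.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>AQI</th><th>GRADE</th><th>PM25</th></tr>
  <tr><td>2018-01-01</td><td>57</td><td>良</td><td>34</td></tr>
  <tr><td>2018-01-02</td><td>50</td><td>优</td><td>28</td></tr>
</table>
"""


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return one dict per data row, keyed by the header row."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = parse_table(SAMPLE_HTML)
print(records[0])
```

Pages that fill their tables with JavaScript need a rendered DOM first, which is presumably why the scraper drives a browser through Selenium before parsing.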

To get the source code, reply “空气质量” (air quality) to the official account; questions can be left there as a message.

